We analyze a variant of the whereabouts search problem, in which a searcher looks for a target hiding in one of n possible
locations.  Unlike  in  the  classic  version,  our  searcher  does  not  pursue  the  target  by  actively  moving  from  one  location  to  the 
next.  Instead,  the  searcher  receives  a  stream  of  intelligence  about  the  location  of  the  target.  At  any  time,  the  searcher  can 
engage  the  location  he  thinks  contains  the  target  or  wait  for  more  intelligence.  The  searcher  incurs  costs  when  he  engages  the 
wrong  location,  based  on  insufficient  intelligence,  or  waits  too  long  in  the  hopes  of  gaining  better  situational  awareness,  which 
allows  the  target  to  either  execute  his  plot  or  disappear.  We  formulate  the  searcher’s  decision  as  an  optimal  stopping  problem 
and  establish  conditions  for  optimally  executing  this  search-and-interdict  mission. 
1.  Introduction

Operation  Neptune  Spear  led  to  the  capture  and  elimination  of 
Osama  bin  Laden  by  the  United  States  in  2011.  Although  U.S. 
intelligence  agencies  had  continuously  collected  information 
regarding  his  whereabouts,  the  dilemma  was  when  to  act. 
Raiding  a  wrong  location,  based  on  insufficient  or  false 
information,  would  cause  collateral  damage,  diplomatic 
blowback,  and  loss  of  intelligence  assets.  On  the  other  hand, 
waiting  too  long  for  more  information  could  result  in  bin 
Laden  escaping.  The  dilemma  between  “act  now”  or  “wait 
and  see”  was  acute  but  fortunately  was  resolved  successfully 
in  this  case.  Another  example  of  such  a  dilemma  concerns 
a  “ticking  bomb”  scenario  (Kaplan  2012).  In  this  scenario, 
a  hiding  terrorist  plots  to  attack  a  target  (e.g.,  a  suicide 
bomber),  and  the  authorities  must  race  to  stop  the  attack. 
A  final  example  involves  an  operation  to  rescue  hostages 
held  by  an  insurgency  group.  The  insurgents  may  kill  the 
hostages  (e.g.,  in  an  escape  attempt)  if  the  authorities  delay 
the  operation  for  too  long.  However,  a  failed  rescue  attempt 
may  alert  the  insurgents,  resulting  in  the  deaths  of  the 
hostages.  Many  military,  law  enforcement,  and  intelligence 
investigations  face  a  similar  trade-off  decision  concerning 
timing  and  cost  of  premature  action. 

Motivated  by  the  aforementioned  examples,  we  consider 
a  search  situation  called  the  whereabouts  search  problem 
(Kadane  1971,  Stone  1975).  In  its  simplest  form,  a  target 
lies hidden in one of $n$ cells, where $p_i$ is the probability
that the target resides in cell $i$, $\sum_i p_i = 1$, and $c_i$ is the
cost of searching cell $i$. The searcher examines one cell at a
time  and  the  search  is  error  free;  if  a  cell  contains  the  target, 
the  searcher  will  detect  it.  The  objective  is  to  find  a  search 
strategy — an  order  in  which  to  search  the  cells — to  minimize 
the  expected  total  search  cost.  Several  variations  of  this 
problem  include,  among  others,  situations  where  a  search  is 
subject  to  error  (Kress  et  al.  2008,  Wilson  et  al.  2011);  the 
target  moves  (Komiya  et  al.  2006)  or  acts  strategically  (An 
et  al.  2013);  and  multiple  targets  arrive  and  disappear  in  a 
random  fashion  (Szechtman  et  al.  2008).  However,  all  of  the 
aforementioned  cases  share  the  same  definition  of  a  strategy, 
namely,  a  search  sequence  for  an  active  searcher. 

In  this  paper,  we  consider  the  same  physical  description 
of  the  whereabouts  problem:  a  single  static  target  hidden  in 
one  of  n  cells.  However,  the  operational  setting  is  different 
in  two  major  aspects:  (a)  the  searcher  does  not  actively 
search  the  cells  but  instead  relies  on  occasional  pieces  of 
intelligence  of  the  form  “cell  i  contains  the  target,”  and 
(b)  the  search  mission  is  time  critical.  The  searcher  does  not 
control  the  arrival  rate  of  intelligence,  and  an  intelligence 
item  may  be  wrong.  At  a  certain  point  the  searcher  may 
choose  a  cell  to  engage  in  the  hope  of  interdicting  the  target. 
If  the  searcher  chooses  the  wrong  cell,  he  incurs  a  cost 
comprising  collateral  damage,  loss  of  intelligence  assets, 
political  ramifications,  etc. 

We describe the problem in §2 and formulate the mathematical
model in §3. The cases of $n = 2$ and $n = \infty$ appear
in §§4 and 5, respectively. Section 6 examines the optimal
strategy when $2 < n < \infty$. We present numerical illustrations
in  §7.  Section  8  discusses  extensions.  All  proofs  appear  in 
the  online  appendix  (available  as  supplemental  material  at 
http://dx.doi.org/10.1287/opre.2016.1488). 

2.  The  Problem 

A  searcher  wants  to  interdict  a  target,  residing  in  one  of 
n  possible  cells,  before  some  event  occurs.  Such  an  event 
would  be,  for  example,  the  disappearance  of  bin  Laden  from 
a  certain  region  or  an  execution  of  a  terror  plot,  which  we 
use  as  our  reference  scenario.  An  attack  occurs  when  the 
plot  fully  matures,  and  the  plotting  time  is  exponentially 
distributed with mean $1/\mu$ (a similar assumption appears in
Kaplan  2010).  Although  the  searcher  may  have  some  initial 
notion  regarding  the  target’s  location  based  on  exogenous 
intelligence,  we  will  often  focus  on  the  case  where  there  is 
none:  the  uniform  prior  location  distribution. 

Independent  intelligence  items  from  human  informants, 
intercepted  communications,  and  interrogations  of  the  form 
“cell  i  contains  the  target”  arrive  according  to  a  Poisson 
process with rate $\lambda$. The searcher has no control over the
timing  or  content  of  the  items.  Thus,  scheduled  sensor  cues 
(e.g.,  RADAR,  SONAR,  images,  videos)  from  cells  do  not 
apply  here.  Although  our  model  applies  to  a  variety  of 
intelligence  sources,  we  use,  as  a  reference  setting,  human 
informants  who  provide  tips.  For  most  of  our  analysis  the 
parameters $\mu$ and $\lambda$ only appear via the intensity ratio

$$\rho = \mu/\lambda.$$

If,  following  a  certain  number  of  tips,  the  searcher  decides 
to  engage  a  specific  cell,  the  search  ends,  even  if  the  searcher 
chooses  incorrectly.  If  the  searcher  engages  the  correct  cell, 
the  target  is  interdicted.  However,  if  the  searcher  engages  the 
wrong  cell,  then  the  target  realizes  that  he  is  being  hunted 
and  therefore  immediately  executes  his  (not  fully  mature) 
plot  before  the  searcher  finds  him.  In  §8.1  we  consider  a 
variant  where  the  target  only  executes  mature  plots  and  the 
searcher  continues  obtaining  intelligence  and  engaging  cells 
until  he  either  finds  the  target  or  the  target  attacks. 

The  searcher  desires  to  minimize  the  expected  cost  of  two 
possible  negative  outcomes:  (a)  engaging  a  wrong  cell  or  (b) 
execution  of  a  mature  attack  by  the  target.  The  costs  of  (a) 
and  (b)  are  c  and  d,  respectively.  The  false  positive  cost 
c  comprises  collateral  damage  resulting  from  engaging  an 
innocent  cell  and  the  (possible)  cost  of  a  premature  attack. 
We  neither  need  nor  make  any  assumption  regarding  the 
relative  values  of  c  and  d.  Because  the  results  to  follow 
only depend on the cost-ratio $\alpha = d/c$, we assume without
loss of generality that $c = 1$ and $d = \alpha$.

A  tip  specifies  the  correct  cell  with  probability  q.  We 
often  refer  to  q  as  the  informant’s  reliability.  Informants 
are neither clueless nor malevolent; that is, $q > 1/n$. If the
informant provides an incorrect tip (with probability $1 - q$),
then the error is uniform; the informant specifies each one
of the $n - 1$ incorrect cells with equal probability.

The question is when the searcher should engage a cell.
We have here a “race” between the flow of tips and the
time of attack. On the one hand, the searcher wants to
receive  as  many  tips  as  possible  to  reduce  his  uncertainty 
about  the  target’s  location.  On  the  other  hand,  this  “wait  and 
see”  approach  may  lead  to  the  target  attacking  before  the 
searcher  has  the  chance  to  do  so.  If  the  searcher  instead 
rushes  to  engage  a  cell,  the  likelihood  of  a  false  positive 
error  increases.  The  searcher  knows  the  values  of  all  the 
parameters involved in this process: $n$, $q$, $\alpha$, and $\rho$.

This  search  problem  is  an  example  of  an  optimal  stopping 
problem  (Chow  et  al.  1971,  Shiryaev  2007,  Ferguson  2004). 
Wald  and  Wolfowitz  (1948)  examine  a  similar  problem  in 
their  work  on  the  sequential  probability  ratio  test.  They 
show  that  the  decision  between  selecting  a  hypothesis  and 
receiving  another  observation  is  optimally  determined  by 
a  threshold  policy.  In  our  model,  when  n  =  2  cells,  we 
find  a  similar  threshold  result  (see  §4),  which  does  not 
hold  for  n  >  2.  For  n  >  2,  our  problem  can  be  framed 
as  a  higher  dimensional  stopping  problem.  Lange  (2012) 
examines  optimal  stopping  of  an  n-dimensional  Brownian 
motion  and  shows  that  the  continuation  region  is  generally 
also  n-dimensional.  Although  standard  one-dimensional 
techniques  do  not  apply,  he  shows  that  the  continuation 
region  can  be  found  by  reformulating  the  problem  as  a 
free-boundary  problem  in  n  dimensions. 

When  n  >  2  cells,  our  problem  relates  to  the  family  of 
multinomial  selection  problems  (Kim  and  Nelson  2006) 
in  which  an  observation  specifies  the  “winner”  among  n 
competing  alternatives.  A  decision  maker  may  either  observe 
a  fixed  number  of  samples  before  choosing  the  best  option 
(Bechhofer  et  al.  1959)  or  may  dynamically  decide,  after  each 
observation,  whether  to  pick  an  alternative  or  receive  another 
observation  (Ramey  and  Alam  1979).  Most  formulations 
desire  to  achieve  a  lower  bound  on  the  probability  of 
choosing  the  correct  alternative,  provided  certain  conditions 
about  the  system  hold.  These  conditions  usually  relate  to  the 
relationship  between  the  true  probabilities  of  the  two  best 
alternatives  (Chen  1988).  A  good  survey  of  the  techniques 
used  in  multinomial  selection  problems  appears  in  Vieira 
et  al.  (2014).  Most  selection  problems  assume  a  deterministic 
number  of  observations.  In  our  problem  the  number  of  tips 
is  random  because  the  time  until  the  plot  matures  is  random. 
We  found  only  two  multinomial  selection  papers  that  examine 
a  random  maximum  number  of  observations  (Frazier  and  Yu 
2007,  Dayanik  and  Yu  2013).  The  model  in  Frazier  and  Yu 
(2007) considers only the $n = 2$ case and allows for a general
stochastic deadline, which is analogous to the time until
the attack occurs in our model. The approach in Dayanik
and Yu (2013) does allow for $n > 2$ alternatives. It focuses
on  neuroscience  applications  and  considers  a  cost-rate,  as 
opposed  to  total  cost  in  our  model. 

Finally,  note  that  our  model  has  one  decision  maker, 
the  searcher.  One  could  view  the  problem  as  having  three 
strategic  players:  the  searcher,  the  target,  and  the  informant. 
We consider here a simpler yet, we believe, realistic situation
where the target does not really know the searcher’s
operational options and the informant is incentivized by
the searcher to do the best he can. One could develop a
two-player  Markov  game  between  the  searcher  and  target 
similar  to  the  Inspection  Game  (see  Chapter  4  of  Washburn 
2014).  However,  the  formulation  would  quickly  become 
unwieldy  because  one  would  need  to  specify  not  only  the 
intelligence  picture  of  each  player  but  also  each  player’s 
perceived  intelligence  picture. 


3.  Mathematical  Preliminaries 

The  decision  to  engage  a  cell  or  wait  for  more  tips  depends 
on  the  expected  cost  of  each  option.  In  this  section  we 
develop  the  mathematical  building  blocks  to  compute  these 
expected  costs.  Two  factors  determining  the  expected  costs 
are  Location  probability,  which  specifies  the  likelihood  that 
cell  i  contains  the  target,  and  Pointing  probability,  which 
specifies  the  likelihood  that  the  next  tip  points  at  cell  i.  In 
§3.1  we  compute  these  probabilities,  and  in  §3.2  we  use 
these  probabilities  to  derive  the  expected  costs. 


3.1.  Location  and  Pointing  Probabilities 

Let $p = (p_1, \dots, p_n)$ denote the current location probabilities
and let $\tilde{p}$ denote the initial location probabilities before the
first tip. Let $s_i$ be the number of tips thus far specifying cell $i$
as the target’s location, and $s = (s_1, \dots, s_n)$. In this subsection
we assume that $s_1 \ge \cdots \ge s_n$. The location probability of
cell $i$ given $s$ is

$$p_i(s) = P[\text{target in } i \mid s] = \frac{P[s \mid \text{target in } i]\,\tilde{p}_i}{\sum_{j=1}^{n} P[s \mid \text{target in } j]\,\tilde{p}_j}. \tag{1}$$



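An informant points to the correct cell with probability $q$
and a specific incorrect cell with probability $(1 - q)/(n - 1)$.
Thus, utilizing the multinomial nature of $s$, we have

$$P[s \mid \text{target in } i] = \frac{\left(\sum_{k=1}^{n} s_k\right)!}{\prod_{k=1}^{n} s_k!}\, q^{s_i} \left(\frac{1-q}{n-1}\right)^{\sum_{k \ne i} s_k} = \frac{\left(\sum_{k=1}^{n} s_k\right)!}{\prod_{k=1}^{n} s_k!} \left(\frac{1-q}{n-1}\right)^{\sum_{k=1}^{n} s_k} \gamma^{s_i}, \tag{2}$$

where

$$\gamma = \frac{q}{(1-q)/(n-1)}. \tag{3}$$
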

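Note that only the $\gamma^{s_i}$ portion of (2) depends on $i$. This is a
direct consequence of our assumption that each wrong cell
is equally likely to be pointed at. When we substitute (2)
back into (1), most terms cancel, and the location probability
simplifies to

$$p_i(s) = \frac{\gamma^{s_i}\,\tilde{p}_i}{\sum_{j=1}^{n} \gamma^{s_j}\,\tilde{p}_j}, \tag{4}$$

where $\tilde{p}_j$ is the initial location probability of cell $j$.
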


Note from Equation (4) that $p_i(s)$ is invariant to additive
shifts in $s$. If $\hat{s}$ is such that $\hat{s}_i = s_i + L$ for all $i$ and some integer $L$,
then $p_i(\hat{s}) = p_i(s)$. Specifically, if we set $L = -s_n = -\min(s)$
and use $\hat{s}_i = s_i - s_n$, then we can write $\hat{s}_i = \sum_{j=i}^{n-1} \Delta_j$, where
$\Delta_j = s_j - s_{j+1} \ge 0$. Therefore, $p_i(s)$ is uniquely determined
by the tip-differentials $\Delta_j$, $j = 1, \dots, n - 1$.
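
As a minimal sketch of Equation (4), the following Python function computes the location probabilities from the tip counts. The name `location_probabilities` and its argument names are illustrative and do not come from the paper, and the sketch assumes $1/n < q < 1$ so that $\gamma$ is finite. Subtracting $\min(s)$ before exponentiating relies on the shift invariance just described and does not change the result.

```python
def location_probabilities(tips, prior, q):
    """Posterior location probabilities from tip counts, as in Equation (4).

    tips  : tip counts s_1..s_n (illustrative name, not from the paper)
    prior : prior location probabilities, summing to 1
    q     : informant reliability, assumed to satisfy 1/n < q < 1
    """
    n = len(tips)
    gamma = q * (n - 1) / (1 - q)      # Equation (3)
    shift = min(tips)                  # additive shifts leave the posterior unchanged
    weights = [gamma ** (s - shift) * p for s, p in zip(tips, prior)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    uniform = [0.25] * 4
    # One tip to cell 1 from a uniform prior (compare Proposition 1).
    print(location_probabilities([1, 0, 0, 0], uniform, q=0.4))
    # Adding the same constant to every count gives the same posterior.
    print(location_probabilities([4, 3, 3, 3], uniform, q=0.4))
```

Working with count differences rather than raw counts also keeps the powers of $\gamma$ small when many tips have arrived.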

Although $s$ or $\Delta$ are natural state vectors, it is simpler
to use the location probabilities $p = (p_1, \dots, p_n)$ as the
state vector for most of the mathematical analysis in §§4-6.
Specifically, if the next tip points to cell $i$, then the updated
probability $p_j^{(+i)}$ for cell $j$ is

$$p_j^{(+i)} = \begin{cases} \dfrac{\gamma p_i}{\gamma p_i + (1 - p_i)} & \text{if } j = i, \\[2ex] \dfrac{p_j}{\gamma p_i + (1 - p_i)} & \text{if } j \ne i. \end{cases} \tag{5}$$



Recall that according to our assumption $q > 1/n$ and therefore
$\gamma > 1$. Consequently, a tip pointing to cell $i$ increases
the posterior location probability of cell $i$ ($p_i^{(+i)} \ge p_i$)
and decreases the posterior probability of other cells ($p_j^{(+i)} \le p_j$ for $j \ne i$).
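
The update in Equation (5) can be sketched in a few lines of Python. The function name `update_after_tip` and the zero-based cell index are choices made for this illustration only; the sketch again assumes $1/n < q < 1$.

```python
def update_after_tip(p, i, q):
    """Apply Equation (5): update location probabilities after a tip to cell i.

    p : current location probabilities (sums to 1)
    i : zero-based index of the cell named by the tip (illustrative convention)
    q : informant reliability, assumed to satisfy 1/n < q < 1
    """
    n = len(p)
    gamma = q * (n - 1) / (1 - q)
    denom = gamma * p[i] + (1 - p[i])
    return [gamma * pj / denom if j == i else pj / denom
            for j, pj in enumerate(p)]


if __name__ == "__main__":
    state = [0.25, 0.25, 0.25, 0.25]
    state = update_after_tip(state, 0, q=0.8)   # first tip names cell 1
    print(state)
    state = update_after_tip(state, 2, q=0.8)   # a conflicting tip names cell 3
    print(state)
```

Because $\gamma > 1$, the named cell's probability never decreases and every other cell's probability never increases, which matches the monotonicity noted in the text.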

We next define $B(p)$ as the set of cells with the highest
location probability:

$$B(p) = \{\, i : p_i = \max_j p_j,\ 1 \le i \le n \,\}. \tag{6}$$

The following proposition defines a lower bound on $\max_j p_j$.
The proof appears in Appendix A.

Proposition 1. If $|B(p)| = 1$ and the prior distribution for
the target’s location is uniform, then $\max_j p_j \ge q$.

Next we consider the pointing probability $r_i(p)$ that
the next tip points to cell $i$, given the current location
probabilities $p$:

$$r_i(p) = P[\text{informant says } i \mid p] = \sum_{k=1}^{n} P[\text{informant says } i \mid p, \text{target in } k] \cdot P[\text{target in } k \mid p] = q\,p_i + \frac{1-q}{n-1} \sum_{k \ne i} p_k = q\,p_i + \frac{1-q}{n-1}\,(1 - p_i). \tag{7}$$

Inspection of (7) reveals that $r_i(p) \in [(1 - q)/(n - 1),\ q]$.
Thus, a tip may point at a cell other than $i$, even if $p_i$ is
close to 1, if $q < 1$. Note also that $r_i(p)$ only depends
on $p_i$; it does not depend upon how the remaining $(1 - p_i)$
probability mass is spread among the other $n - 1$ cells.
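
A short Python sketch of Equation (7) follows; `pointing_probabilities` is a hypothetical helper name introduced here. It returns the whole vector of pointing probabilities, which sums to one because every tip names exactly one cell.

```python
def pointing_probabilities(p, q):
    """Probability that the next tip names each cell, as in Equation (7).

    p : current location probabilities (sums to 1)
    q : informant reliability
    """
    n = len(p)
    wrong = (1 - q) / (n - 1)          # chance of naming one specific wrong cell
    return [q * pi + wrong * (1 - pi) for pi in p]


if __name__ == "__main__":
    r = pointing_probabilities([0.7, 0.2, 0.1], q=0.6)
    print(r, sum(r))
```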

3.2.  Expected  Cost 

Define $C(p)$ as the expected cost if the searcher acts optimally
in state $p$. Since an optimal stopping problem is a dynamic
programming problem (Chow et al. 1971), we compute $C(p)$
by comparing the expected costs of two decisions: engage
or wait. That is,

$$C(p) = \min\Big\{ \text{expected cost if the searcher engages a cell},\ \frac{\rho}{1+\rho}\,\alpha + \frac{1}{1+\rho} \cdot \big(\text{expected cost after receiving the next tip}\big) \Big\}. \tag{8}$$


If the searcher decides to wait, the target may attack before
the searcher receives the next tip. In that case, which happens
with probability $\rho/(1 + \rho)$, the mature attack produces a
cost of $\alpha$. If the next tip arrives before the target’s attack,
the system transitions, and we assume the searcher behaves
optimally in the future. Next we compute the expected costs
of the two possible options: engage or wait.
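
The two weights in Equation (8) follow from the standard result for competing independent exponential clocks. As a short worked derivation, write $T_a$ for the time until the plot matures (rate $\mu$) and $T_t$ for the time until the next tip (rate $\lambda$), with $\rho = \mu/\lambda$; the symbols $T_a$ and $T_t$ are introduced here only for illustration.

```latex
\begin{align*}
P[T_a < T_t] &= \int_0^\infty \mu e^{-\mu t}\, e^{-\lambda t}\, dt
              = \frac{\mu}{\mu + \lambda}
              = \frac{\mu/\lambda}{1 + \mu/\lambda}
              = \frac{\rho}{1 + \rho}, \\
P[T_t < T_a] &= 1 - \frac{\rho}{1 + \rho} = \frac{1}{1 + \rho}.
\end{align*}
```

Because both clocks are memoryless, the same probabilities apply at every decision point, regardless of how long the searcher has already waited.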

If the searcher decides to engage cell $j$ while in state $p$,
the expected cost is $1 - p_j$. Obviously, the searcher should
engage a cell in $B(p)$; the searcher can use any tie-breaking
mechanism if $B(p)$ contains multiple cells. To simplify
notation, we henceforth assume without loss of generality
that $p_1 \ge \cdots \ge p_n$. Therefore, $B(p)$ contains cell 1 and

$$E[\text{Cost if searcher decides to engage} \mid p] = 1 - \max_j p_j = 1 - p_1. \tag{9}$$


If the searcher decides to wait, and an informant next
points to cell $i$, then $p$ transitions to $p^{(+i)}$ according to
Equation (5). The informant points to cell $i$ with probability
$r_i(p)$, and the searcher will incur an expected cost of $C(p^{(+i)})$
if this occurs. Putting these pieces together, we have

$$E[\text{Cost if waiting for and receiving the next tip} \mid p] = \sum_{i=1}^{n} P[\text{informant says } i \mid p]\; C(p^{(+i)}) = \sum_{i=1}^{n} r_i(p)\, C(p^{(+i)}). \tag{10}$$


Moving to the general case, we combine Equations (8),
(9), and (10) to produce the complete cost function:

$$C(p) = \min\Big\{ 1 - p_1,\ \frac{\rho}{1+\rho}\,\alpha + \frac{1}{1+\rho} \sum_{i=1}^{n} r_i(p)\, C(p^{(+i)}) \Big\}. \tag{11}$$


If  the  searcher  is  indifferent  between  engaging  and  waiting, 
he  engages.  In  Appendix  B  we  present  characteristics  of 
C(p),  such  as  its  concavity.  Because  most  of  these  results 
are  fairly  intuitive  (e.g.,  C(p)  decreases  if  the  informant  next 
points  to  cell  1),  we  defer  this  discussion  to  the  appendix. 
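
One way to explore the recursion in Equation (11) numerically is to truncate it: allow the searcher to wait for at most a fixed number of further tips and force an engagement after that. The sketch below does this in Python. It is not the paper's solution method; the function name `truncated_cost`, the `horizon` parameter, and the use of tip counts as the memoized state are all assumptions made for this illustration. Since forcing an engagement only removes options, the returned value is an upper bound on $C(p)$ that cannot increase as `horizon` grows.

```python
from functools import lru_cache


def truncated_cost(prior, q, alpha, rho, horizon):
    """Upper bound on C(prior) from Equation (11), waiting for at most `horizon` tips.

    prior   : tuple of prior location probabilities (sums to 1)
    q       : informant reliability, assumed 1/n < q < 1
    alpha   : cost ratio (mature attack cost over false positive cost)
    rho     : intensity ratio (attack rate over tip rate)
    horizon : maximum number of further tips before engagement is forced
    """
    n = len(prior)
    gamma = q * (n - 1) / (1 - q)
    wrong = (1 - q) / (n - 1)

    def posterior(s):
        # Equation (4), using count differences to keep the powers small.
        m = min(s)
        w = [gamma ** (sj - m) * pj for sj, pj in zip(s, prior)]
        total = sum(w)
        return [x / total for x in w]

    @lru_cache(maxsize=None)
    def cost(s, tips_left):
        p = posterior(s)
        engage = 1 - max(p)                        # Equation (9)
        if tips_left == 0:
            return engage                          # truncation: must engage now
        wait = rho / (1 + rho) * alpha             # attack arrives before the next tip
        for i in range(n):
            r_i = q * p[i] + wrong * (1 - p[i])    # Equation (7)
            nxt = s[:i] + (s[i] + 1,) + s[i + 1:]
            wait += r_i * cost(nxt, tips_left - 1) / (1 + rho)
        return min(engage, wait)

    return cost((0,) * n, horizon)


if __name__ == "__main__":
    for h in (0, 1, 5, 20):
        print(h, truncated_cost((0.5, 0.5), q=0.9, alpha=0.5, rho=0.01, horizon=h))
```

The number of memoized states grows quickly with both $n$ and `horizon`, so this sketch is only practical for small instances.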

4.  The  Case  of  Two  Cells 

Arguably, the simpler the form of the optimal policy, the
more attractive it is operationally. One such simple form
is a threshold policy: the searcher engages if and only if
$p_1 \ge \tau$ for some threshold $\tau$ (recall we assume that $p_1 \ge p_2$).
The next corollary follows from the convexity of the engage
region (see Proposition EC.2 in Appendix B).

Corollary 1. For $n = 2$, the searcher should engage if
and only if $p_1 \ge \tau$ for some threshold $\tau \in [0.5, 1)$.

We prove this corollary in Appendix C. While there is
an explicit expression for the threshold $\tau$, its derivation is
cumbersome and therefore we defer most of its details to
Appendix D. A necessary and sufficient condition to engage
in all states (i.e., $\tau = 0.5$) is

$$\frac{1}{2} \ge \frac{\rho}{1+\rho}\,(1 - \alpha) + \frac{1}{1+\rho}\,q. \tag{12}$$

If condition (12) does not hold, then $\tau > 0.5$. See Appendix D
for the general expression for $\tau$ when $\tau > 0.5$. The implication
is straightforward; if damage from a mature attack
exceeds the false positive cost ($\alpha \ge 1$) and the informant
has low reliability ($q \approx 0.5$), the searcher should always
engage. The benefits from future tips are small, and the risk
of waiting is high.

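To derive $\tau$ we leverage the rich results related to
the gambler’s ruin problem. Denote $\tilde{p} = (\tilde{p}_1, 1 - \tilde{p}_1)$ as the prior state
before the arrival of the $s_1 + s_2$ tips. Using Equation (4) we
transform $\tilde{p}$ to $p$:

$$p_1 = \frac{\gamma^{s_1}\tilde{p}_1}{\gamma^{s_1}\tilde{p}_1 + \gamma^{s_2}(1 - \tilde{p}_1)} = \frac{\gamma^{s_1 - s_2}\tilde{p}_1}{\gamma^{s_1 - s_2}\tilde{p}_1 + (1 - \tilde{p}_1)}, \tag{13}$$

$$p_2 = \frac{\gamma^{s_2}(1 - \tilde{p}_1)}{\gamma^{s_1}\tilde{p}_1 + \gamma^{s_2}(1 - \tilde{p}_1)} = \frac{1 - \tilde{p}_1}{\gamma^{s_1 - s_2}\tilde{p}_1 + (1 - \tilde{p}_1)}. \tag{14}$$

To update the probabilities we only need to know the tip-differential
$s_1 - s_2$. We model $\Delta = s_1 - s_2$ as a random
walk. For a given prior $\tilde{p}$, we can transform the threshold
policy from the real number $\tau$ to two nonnegative integers
$A(\tilde{p}, \tau)$ and $B(\tilde{p}, \tau)$ such that the searcher waits as long as
$-B(\tilde{p}, \tau) < \Delta < A(\tilde{p}, \tau)$. If $\Delta$ first hits $A(\tilde{p}, \tau)$ ($-B(\tilde{p}, \tau)$),
the searcher engages cell 1 (cell 2). This approach facilitates
the use of gambler’s ruin machinery to compute relevant
parameters (see Appendix D for details).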
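
The translation from a probability threshold to integer barriers on the tip-differential can be sketched as follows. This is an illustrative reading of the construction, not the derivation in Appendix D: the function name `tip_differential_bounds` is hypothetical, and the sketch assumes the barriers are the smallest nonnegative integers at which the posterior of cell 1 (respectively cell 2) reaches the threshold, with $0.5 < q < 1$, a prior strictly between 0 and 1, and a threshold below 1.

```python
def tip_differential_bounds(prior_p1, q, tau):
    """Smallest nonnegative integers (A, B) at which the threshold is reached.

    prior_p1 : prior probability that cell 1 holds the target (0 < prior_p1 < 1)
    q        : informant reliability, assumed 0.5 < q < 1 (two cells)
    tau      : engage threshold, assumed 0.5 <= tau < 1
    """
    gamma = q / (1 - q)                # Equation (3) with n = 2

    def p1(delta):
        # Equation (13): posterior of cell 1 at tip-differential delta.
        w = gamma ** delta * prior_p1
        return w / (w + (1 - prior_p1))

    a = 0
    while p1(a) < tau:                 # upper barrier: engage cell 1
        a += 1
    b = 0
    while 1 - p1(-b) < tau:            # lower barrier: engage cell 2
        b += 1
    return a, b


if __name__ == "__main__":
    print(tip_differential_bounds(0.5, q=0.8, tau=0.9))
    print(tip_differential_bounds(0.7, q=0.8, tau=0.9))
```

With a uniform prior the two barriers are symmetric; a prior that favors cell 1 pulls the upper barrier in and pushes the lower barrier out.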

It is difficult to gain much insight about the optimal
threshold $\tau$ using purely analytic approaches. Thus, we
illustrate its behavior using several figures. Figure 1 presents
how the threshold $\tau$ varies with informant reliability $q$ for
fixed cost-ratio $\alpha$ and intensity-ratio $\rho$. As we move from
Figures 1(a) to 1(c), we increase $\alpha$ from 0.5 to 2. Each curve
on a figure corresponds to a fixed value of $\rho \in \{0.01, 0.1, 1\}$.
The threshold $\tau$ is a nondecreasing function of $q$. A more
reliable informant reduces the engage region and makes the
searcher more likely to wait because future tips are more
valuable. The threshold decreases as we increase either $\alpha$
(mature attack becomes more costly) or $\rho$ (mature attack
becomes more imminent) and hence the engage region
expands. In particular, in some situations with large $\alpha$ and/or
large $\rho$, the searcher immediately engages regardless of the
current state $p$ or informant reliability $q$.

Figure 1. Engage threshold $\tau$ as a function of $q$ for fixed combinations of $\rho \in \{0.01, 0.1, 1\}$ and $\alpha \in \{0.5, 1, 2\}$. Panels: (a) $\alpha = 0.5$, (b) $\alpha = 1$, (c) $\alpha = 2$; horizontal axis: $q$.

An interesting phenomenon relates to the expected number
of tips received by the searcher when acting optimally. One
would expect that this number will decrease as the informant
becomes more reliable and therefore the searcher can reach
the engage decision faster. Figure 2 demonstrates that this is
not always the case. See Appendix E for the derivation of the
expected number of tips. Assuming the search starts in the
uniform state $p = (0.5, 0.5)$, Figures 2(b) and 2(c) show that
if $\rho$ is small (the inflow rate of tips is much larger than the
attack rate) it is possible that the expected number of tips
actually increases with $q$ when the latter is small enough.
This nonmonotonicity results from two conflicting factors.
On one hand, as $q$ increases the threshold increases (see
Figure 1), which suggests that the searcher may need more
tips to reach the threshold. On the other hand, a larger $q$
implies that the informant will point to the correct cell more
frequently, which suggests that the searcher will reach the
threshold following fewer tips. Specifically, for $q = 1$, the
searcher will only need one tip. In general, the first or second
factor may dominate depending upon the values of $\alpha$, $\rho$, and
$q$. In most cases, when $\rho$ is relatively large, the imminent
attack dictates a swift action by the searcher, as shown in the
dashed and -o- curves, which are close to zero.

The jumps in Figure 2 occur when the optimal tip-differential
changes by one. For a fixed optimal tip-differential,
the expected number of tips decreases as $q$
increases because a more reliable informant will produce
a stream of tips that reaches that tip-differential faster
(probabilistically) than a less reliable informant.

5.  The  Case  of  an  Infinite  Number  of  Cells 

When $n$ is very large and the cells are equally likely to
contain the target, it is unlikely that the informant will
point to the same incorrect cell twice. Thus, a second tip
to the same cell should indicate that it is the correct one.
In Appendix F.1 we make this argument more rigorous. If
$n = \infty$ and the informant points twice to the same cell, then
the searcher knows with certainty that this cell contains the
target. We refer to the second tip to the same cell as the
confirming tip. In Appendix F.2 we derive the optimal policy,
which we summarize in the next proposition.

Proposition 2. The searcher will choose the lowest cost
alternative among the following three options:

1. Immediately engage any cell before receiving the first
tip: cost is 1.

2. Obtain one tip and engage the corresponding cell: cost
is $(\rho/(1 + \rho))\alpha + (1/(1 + \rho))(1 - q)$.

3. Wait for the confirming tip and then engage: cost is
$\alpha\,\big(1 - (q/(\rho + q))^2\big)$.



Figure 2. Expected number of tips, starting from the uniform state $p = (0.5, 0.5)$ until the search ends, as a function of $q$
for fixed combinations of $\rho \in \{0.01, 0.1, 1\}$ and $\alpha \in \{0.5, 1, 2\}$. Panels: (a) $\alpha = 0.5$, (b) $\alpha = 1$, (c) $\alpha = 2$.

Note. The search ends either when the searcher engages or when a mature attack occurs.

Figure 3. Searcher should engage for $(\rho, \alpha)$ lying above the solid line, wait for the confirming tip if $(\rho, \alpha)$ lies below the
dashed line, and engage after one tip for situations between the two curves. Panels: (a) $q = 0.1$, (b) $q = 0.8$.



Thus, the searcher should

choose option 1 iff $\alpha > 1 + \dfrac{q}{\rho}$,

choose option 3 iff $\alpha < 1 + \dfrac{q}{\rho} \cdot \dfrac{q - \rho^2 - \rho q - q^2}{\rho + 2q - q^2}$,

choose option 2 otherwise.

The searcher causes collateral damage if he chooses
option 1 because, with $n = \infty$, a blind engagement hits the
wrong cell with probability 1. The cost for
option 2 follows immediately from (11) because $p_1 = q$
after the tip. If the searcher chooses option 3, there is no
collateral damage, but the target may execute the attack
before the confirming tip arrives.
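
The three costs in Proposition 2 and the resulting decision rule can be checked against each other numerically. The Python sketch below does so; the function names are hypothetical, and the closed-form rule is written exactly as stated above, with strict inequalities. Ties between options are not treated specially.

```python
def option_costs(q, alpha, rho):
    """Costs of the three options in Proposition 2 (n = infinity)."""
    blind = 1.0
    one_tip = rho / (1 + rho) * alpha + (1 - q) / (1 + rho)
    confirm = alpha * (1 - (q / (rho + q)) ** 2)
    return blind, one_tip, confirm


def best_option(q, alpha, rho):
    """Option number (1, 2, or 3) with the lowest cost."""
    costs = option_costs(q, alpha, rho)
    return 1 + min(range(3), key=lambda k: costs[k])


def best_option_closed_form(q, alpha, rho):
    """Same decision, using the threshold conditions on alpha."""
    if alpha > 1 + q / rho:
        return 1
    ratio = (q - rho ** 2 - rho * q - q ** 2) / (rho + 2 * q - q ** 2)
    if alpha < 1 + (q / rho) * ratio:
        return 3
    return 2


if __name__ == "__main__":
    for alpha in (0.3, 1.3, 2.9, 11.7):
        print(alpha, option_costs(0.8, alpha, 0.1),
              best_option(0.8, alpha, 0.1),
              best_option_closed_form(0.8, alpha, 0.1))
```

Comparing the two functions over a grid of parameter values is a quick way to catch transcription errors in either the costs or the thresholds.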

Figure 3 illustrates what the searcher should do for
different $(\alpha, \rho)$ pairs for $q \in \{0.1, 0.8\}$. The searcher chooses
option 1 if the parameters lie above the solid curve, option 3
if the parameters lie below the dashed curve, and option
2 otherwise. The searcher is more likely to wait for the
confirming tip for small $(\alpha, \rho)$ pairs and engage for large
values. Not surprisingly, the region in which option 2 is
optimal increases as we increase $q$ because one tip provides
significant information for larger values of $q$.

The optimal strategy for the $n = \infty$ case suggests a
heuristic for $n < \infty$, where the searcher chooses among
the three options listed in Proposition 2. We compute the
finite-$n$ costs for these three options in Appendix F.3.1.
Overall, the heuristic performs very well and provides near-optimal
results in many situations, often even for small $n$.
This heuristic generates a cost within 1% of the optimal cost on average over
many scenarios covering a variety of different parameter
combinations. Unfortunately, this heuristic only applies for
the uniform state. Appendices F.3.2 and J.1 contain more
details on the performance of this heuristic.

6.  Policy for 2 < n < ∞

Suppose that $q = 1$. In this case, the searcher either immediately
engages cell 1, or he waits for the first tip and then
engages the correct cell. In the former the expected cost
is $(1 - p_1)$, and in the latter it is $(\rho/(1 + \rho))\alpha$. Thus, the
searcher should engage now if and only if

$$p_1 \ge \frac{\rho}{1+\rho}\,(1 - \alpha) + \frac{1}{1+\rho}. \tag{15}$$


Condition (15) is sufficient to engage for any value of $q$. We
derive this formally in §6.1. This observation leads to the
following preliminary analysis for the case where $q < 1$ and
the searcher has no prior information: $p_1 = \cdots = p_n = 1/n$.
In that case the searcher engages any cell before receiving
a tip if $1/n \ge (\rho/(1 + \rho))(1 - \alpha) + 1/(1 + \rho)$. We call
this situation a blind engagement because the searcher
effectively shoots in the dark. If the searcher obtains one tip
and engages the corresponding cell, then the initial state
$p = (1/n, 1/n, \dots, 1/n)$ transitions to $p^{(+1)} = (q, (1 - q)/(n - 1), \dots, (1 - q)/(n - 1))$
(see Equation (5)) and the
expected cost is $(\rho/(1 + \rho))\alpha + (1/(1 + \rho))(1 - q)$. Thus, if
$1 - 1/n > (\rho/(1 + \rho))\alpha + (1/(1 + \rho))(1 - q)$, the searcher
should wait. In summary, we have

$$\text{if } \frac{1}{n} \ge \frac{\rho}{1+\rho}\,(1 - \alpha) + \frac{1}{1+\rho} \ \longrightarrow\ \text{blind engagement}, \tag{16}$$

$$\text{if } \frac{1}{n} < \frac{\rho}{1+\rho}\,(1 - \alpha) + \frac{1}{1+\rho}\,q \ \longrightarrow\ \text{wait}. \tag{17}$$

If $1/n$ falls between the two bounds, additional analysis
is needed. Note the equivalence between condition (17)
and the two-cell condition in (12). Conditions (16)-(17)
suggest that if $n$ is small, $\rho$ is large (an imminent attack is
likely), and $\alpha$ is large (damage from a mature attack exceeds
the false positive cost), then the searcher may optimally
choose a cell uniformly at random before receiving any tips.
Figure 4 presents the region in $(\alpha, \rho)$ space where the searcher
chooses to wait rather than blindly engage (condition (17))
for different values of $n$ and $q$. The wait region falls below
the curves. For large $n$ and a reliable informant, the searcher
will wait for even reasonably large values of $\alpha$ and $\rho$. The
curves look similar to those in Figure 3 for the $n = \infty$ case.
The solid curve in Figure 3 corresponds to the thin dashed
curve in the northeastern portion of Figure 4, which
represents the limiting case as $n \to \infty$.

Figure 4. For the uniform state, the searcher should receive at least one tip if $(\rho, \alpha)$ lies below the curve. Panels: (a) $q = 0.4$, (b) $q = 0.8$.
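
Conditions (16) and (17) for the uniform state can be sketched as a small Python classifier. The function name `uniform_state_decision` and the three string labels are illustrative choices; the "undetermined" label simply marks the gap between the two bounds, where the text says additional analysis is needed.

```python
def uniform_state_decision(n, q, alpha, rho):
    """Classify the uniform state 1/n using conditions (16) and (17).

    Returns "blind engagement", "wait", or "undetermined" (between the bounds).
    """
    base = rho / (1 + rho) * (1 - alpha)
    engage_bound = base + 1 / (1 + rho)    # right-hand side of (16)
    wait_bound = base + q / (1 + rho)      # right-hand side of (17)
    if 1 / n >= engage_bound:
        return "blind engagement"
    if 1 / n < wait_bound:
        return "wait"
    return "undetermined"


if __name__ == "__main__":
    print(uniform_state_decision(n=2, q=0.9, alpha=0.5, rho=0.01))
    print(uniform_state_decision(n=2, q=0.55, alpha=2.0, rho=1.0))
    print(uniform_state_decision(n=3, q=0.45, alpha=1.0, rho=0.5))
```

Since $q < 1$, the wait bound always lies below the engage bound, so the two conditions cannot hold at the same time.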

We now turn to the general nonuniform state. Unlike
the $n = 2$ case, there is no threshold policy for optimally
responding to tips, as shown in the next example.

Example 2: Let $q = 0.3$, $\alpha = 0.8$, $\rho = 1/9$. The searcher
should engage in state $p = (0.316, 0.246, 0.246, 0.191)$
and should wait in state $p' = (0.366, 0.366, 0.134, 0.134)$.
However, $0.316 = p_1 < p'_1 = 0.366$.

Example 2 suggests that the key factor driving the decision
lies in the differential between the two cells with the highest
probability. This type of result appears in many algorithms
used for multinomial selection problems (Bechhofer et al.
1959, Ramey and Alam 1979, Kim and Nelson 2006). One
might propose that the optimal policy takes a threshold
form based on $p_1 - p_2$ or $p_1/p_2$. Unfortunately, the next
example shows a threshold policy based on either of those
two quantities is not optimal.

Example 3: Let $q = 0.42$, $\alpha = 0.5$, $\rho = 1$. The searcher
should engage in state $\mathbf{p} = (0.556, 0.384, 0.060)$ but should
wait in state $\mathbf{p} = (0.512, 0.244, 0.244)$. The engage state has
both the smaller difference ($p_1 - p_2 = 0.172$ versus $0.268$) and
the smaller ratio ($p_1/p_2 \approx 1.45$ versus $2.10$), so neither
quantity can serve as a threshold.

Our state space $\{\mathbf{p} \mid p_1 \ge \cdots \ge p_n,\ \sum_{i=1}^{n} p_i = 1\}$ is an $(n-1)$-
dimensional closed convex set, and thus we should not be
surprised that the optimal policy cannot be represented by
a one-dimensional subspace. Because the optimal policy
does not take on a simple form, we next present sufficient
conditions to engage or wait. The searcher can use the
conditions in this section as the basis for heuristic policies.
We compare these heuristic policies to the optimal policy in
§7.1 and Appendix J.

We derive the sufficient conditions by computing upper
and lower bounds on the value of the second term of the cost
function $C(\mathbf{p})$ in Equation (11); the second term corresponds
to the expected cost to wait. If the engage value $1 - p_1$ is less
than or equal to this lower bound, then the searcher should
engage in state $\mathbf{p}$. If $1 - p_1$ exceeds the upper bound, then
the searcher should wait in state $\mathbf{p}$. If $1 - p_1$ lies between
the lower bound and upper bound to wait, then we need
to perform additional analysis or derive tighter bounds to
determine the searcher’s optimal decision.

We defer the construction of the upper and lower bounds
to Appendix G. Rather than focus on the general structure
of the bounds, we instead present several specific sufficient
conditions to engage or wait in §§6.1 and 6.2, respectively.
These conditions converge to a necessary and sufficient
condition to engage (see Proposition EC.7 in Appendix G).
This allows us to theoretically approximate $C(\mathbf{p})$ to any
desired precision and determine whether the searcher should
engage or wait in state $\mathbf{p}$. The computational feasibility
depends upon $\rho$ (see (EC.100)-(EC.101) in Appendix G).
For $\rho \ge 0.1$, we can solve for $C(\mathbf{p})$ and the optimal decision
in less than a second on most problems on a standard laptop
for $n \approx 100$. However, for $\rho \le 0.01$ the calculations can bog
down or become intractable for $n \ge 10$.

6.1.  Sufficient  Conditions  to  Engage 

In  Appendix  H  we  present  several  sufficient  conditions  to 
engage,  including  a  family  of  conditions  that  converges  to  a 
necessary  and  sufficient  condition.  Here  we  focus  on  three 
conditions  to  engage  that  provide  insight  on  the  decision. 

For our first bound we set $C(\mathbf{p}^{(+i)}) = 0$ in (11). This
assumes that the searcher knows the location of the target
with certainty after receiving one tip. This best-case scenario
produces a lower bound on the optimal cost $C(\mathbf{p})$ and yields
condition (15), which we derived earlier by assuming $q = 1$.
Combining Proposition 1 and condition (15) produces the
following sufficient condition to engage:

$$
\text{engage if } q \ge \frac{\rho}{1+\rho}(1-\alpha) + \frac{1}{1+\rho}, \quad \text{for uniform prior and } |B(\mathbf{p})| = 1. \tag{18}
$$

If condition (18) holds for the uniform prior case, then the
searcher would receive at most one tip before engaging
cell 1.
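The algebra behind a bound of this shape can be sketched as follows. The derivation assumes that the cost function (11) has the two-term form written on the first line, which is inferred from the surrounding discussion of the engage value $1 - p_1$ and the expected cost to wait, not quoted from the paper.

```latex
% Assumed form of the cost function (11):
C(\mathbf{p}) = \min\Big\{\, 1 - p_1,\;
   \frac{\rho}{1+\rho}\,\alpha
   + \frac{1}{1+\rho}\sum_{i=1}^{n} r_i(\mathbf{p})\, C\big(\mathbf{p}^{(+i)}\big) \Big\}

% Setting C(p^{(+i)}) = 0 bounds the cost to wait from below:
\text{cost to wait} \;\ge\; \frac{\rho}{1+\rho}\,\alpha

% Engaging is therefore optimal whenever
1 - p_1 \le \frac{\rho}{1+\rho}\,\alpha
\;\Longleftrightarrow\;
p_1 \ge 1 - \frac{\rho}{1+\rho}\,\alpha
      = \frac{\rho}{1+\rho}(1-\alpha) + \frac{1}{1+\rho}

% Starting from the uniform prior, one tip gives p_1 = q, so the
% test becomes a condition on q alone:
q \ge \frac{\rho}{1+\rho}(1-\alpha) + \frac{1}{1+\rho}
```

The last line has the same right-hand side as the blind-engagement bound in (16), with $q$ taking the place of $1/n$.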

To derive a tighter, less conservative sufficient condition
to engage, we set $C(\mathbf{p}^{(+i)}) = 0$ after two tips in (11) (rather
than after one as assumed in (15)). In Appendix H.1 we
show that if the following condition holds, then the searcher
should engage cell 1.

$$
p_1 \ge \frac{\rho}{1+\rho}(1-\alpha) + \frac{1}{1+\rho} \sum_{i=1}^{n} r_i(\mathbf{p}) \max\left( \max_j p_j^{(+i)},\ \frac{\rho}{1+\rho}(1-\alpha) + \frac{1}{1+\rho} \right) \tag{19}
$$

The right-hand side of (19), which now depends on $q$ through
$r_i(\mathbf{p})$ and $p_j^{(+i)}$, is always smaller than the right-
hand side of (15). We derive (15) from (11) by assuming
$C(\mathbf{p}^{(+i)}) = 0$, but we derive (19) from (11) by assuming

$$
C(\mathbf{p}^{(+i)}) = \min\left( 1 - \max_j p_j^{(+i)},\ \frac{\rho}{1+\rho}\,\alpha \right) \ge 0.
$$

We conclude this subsection with a heuristic based on the
threshold policy for the two-cell case, where cells $2, 3, \ldots, n$
are combined into an uber-cell. Accordingly, define a two-cell
state $\hat{\mathbf{p}}$ such that $\hat{p}_1 = p_1$ and $\hat{p}_2 = 1 - p_1 = \sum_{i=2}^{n} p_i$. If
the searcher chooses to engage cell 1 when compared to the
uber-cell, then the searcher should also engage cell 1 in the
$n$-cell problem. We must modify $q$ when moving from the $n$-
cell problem to the two-cell problem to maintain the same $\gamma$,
which captures informant effectiveness independent of $n$.
Specifically, define $\hat{q} = \gamma/(1 + \gamma)$, where $\gamma$ applies to the
original $n$-cell problem. If we denote $\tau(\hat{q}, \alpha, \rho)$ as the optimal
threshold for the two-cell problem (see Proposition EC.4 of
Appendix D), then we have the following condition:

$$
\text{engage if } p_1 \ge \tau(\hat{q}, \alpha, \rho). \tag{20}
$$

6.2. Sufficient Conditions to Wait

Appendix I derives conditions to wait based on the common
heuristic called the $k$-stage look-ahead rule. The searcher can
receive at most $k$ additional tips; after receiving the $k$th tip,
the searcher must engage. Because the $k$-stage look-ahead
rule restricts the searcher’s strategy space, the policy will
produce an upper bound on the cost function $C(\mathbf{p})$. Consequently,
if the $k$-stage look-ahead policy recommends to wait,
then the searcher should optimally wait. See Chapter 5.1 of
Ferguson (2004) or 7.4 of Berger (1985) for more details
on the $k$-stage look-ahead policy. This heuristic transforms
the infinite horizon problem of solving for $C(\mathbf{p})$ in (11) to
a finite horizon problem. For small values of $k$, backward
induction provides a computationally tractable approach.
The $k$-stage look-ahead heuristic usually performs well in
practice (Ferguson 2004).

We now focus on a myopic policy where $k = 1$. In this
case the searcher considers just two options: (1) engage cell
1 or (2) wait for the next tip and then engage. Condition (17)
corresponds to the myopic policy starting from the uniform
state. More generally, if the searcher uses the myopic policy,
he will engage cell 1 if

$$
p_1 \ge \frac{\rho}{1+\rho}(1-\alpha) + \frac{1}{1+\rho} \sum_{i=1}^{n} \max\left( q\,p_i,\ \frac{1-q}{n-1}\,p_1 \right). \tag{21}
$$

See Appendix I.1 for the derivation of (21). If condition
(21) does not hold, the searcher waits until the next tip and
then repeats the comparison between the two options using
the new information obtained from the tip. The myopic
condition simplifies in two special cases that depend upon
the max term in (21):

$$
p_1 \ge
\begin{cases}
\dfrac{\rho}{1+\rho}(1-\alpha) + \dfrac{1}{1+\rho}\,q & \text{if } p_1 \le \gamma p_i \ \ \forall i, \\[1ex]
1 - \alpha & \text{if } p_1 \ge \gamma p_i \ \ \forall i > 1.
\end{cases}
\tag{22}
$$


The  first  case  in  (22)  occurs  when  the  max  expression  in 
(21)  always  returns  the  first  term.  This  situation  corresponds 
to  a  “roughly  uniform”  state  p;  whatever  cell  the  informant 
points  to  with  the  next  tip  will  become  a  best  candidate 
cell.  The  first  case  in  (22)  is  similar  to  the  condition  for  the 
optimal  threshold  in  the  two-cell  case  exceeding  0.5  (see 
Equation  (12)).  The  second  case  in  (22)  corresponds  to  the 
case  when  the  max  in  (21)  always  returns  the  second  term. 
This occurs when cell 1 is a “strong” best candidate cell;
even if the informant points to cell $i \ne 1$ with the next tip,
cell 1 remains a best candidate cell.

If $\rho \gg 1$ (i.e., the threat is imminent and tips are scarce)
or we have a highly reliable informant ($q$ close to 1), the
myopic conditions to engage in (21)-(22) closely resemble
the sufficient condition to engage in (15). In this case, the
myopic policy produces nearly optimal recommendations.

The first part of condition (22) holds for the uniform
state $\mathbf{p} = (1/n, \ldots, 1/n)$ and corresponds to condition (17).
Following one tip (pointing at cell 1) the system transitions
from $\mathbf{p}$ to a new state in which $p_1 = q$ and $p_i = (1-q)/(n-1)$
for $i > 1$. Therefore the second part of condition (22)
holds for this new state. Consequently, if $(1 - q) < \alpha$ and the search
starts with a uniform prior, the searcher obtains at most
one tip before engaging if he follows the myopic policy.
Specifically, the searcher engages cell 1 before obtaining
any tips if

$$
\frac{1}{n} \ge \frac{\rho}{1+\rho}(1-\alpha) + \frac{1}{1+\rho}\,q.
$$

Otherwise the searcher engages the cell provided in the first
tip since $p_1 = q > 1 - \alpha$.
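A k-stage look-ahead rule of this kind can be sketched as a short recursion. The sketch below is illustrative only. It assumes that Equation (5) is the standard Bayes update for a tip that names the correct cell with probability $q$ and each wrong cell with probability $(1-q)/(n-1)$, and that the cost function (11) is the minimum of the engage cost $1 - p_1$ and the wait cost $\rho\alpha/(1+\rho) + (1/(1+\rho))\sum_i r_i(\mathbf{p})\,C(\mathbf{p}^{(+i)})$. The names `tip_update` and `lookahead` are hypothetical.

```python
def tip_update(p, i, q):
    """Posterior after a tip naming cell i, and the probability of that tip."""
    n = len(p)
    miss = (1.0 - q) / (n - 1)  # chance a tip names one particular wrong cell
    weights = [pj * (q if j == i else miss) for j, pj in enumerate(p)]
    r_i = sum(weights)          # probability that the next tip names cell i
    return tuple(w / r_i for w in weights), r_i


def lookahead(p, k, q, alpha, rho):
    """Return (cost, action) when at most k more tips may be collected.

    With k = 0 the searcher must engage the most likely cell.
    With k = 1 this is the myopic comparison of "engage now" against
    "take one tip, then engage". The run time grows like n**k.
    """
    engage = 1.0 - max(p)
    if k == 0:
        return engage, "engage"
    wait = rho / (1.0 + rho) * alpha  # an attack occurs before the next tip
    for i in range(len(p)):
        posterior, r_i = tip_update(p, i, q)
        future_cost, _ = lookahead(posterior, k - 1, q, alpha, rho)
        wait += r_i * future_cost / (1.0 + rho)
    return (engage, "engage") if engage <= wait else (wait, "wait")
```

Calling `lookahead(p, 1, q, alpha, rho)` reproduces the one-tip comparison, and larger `k` can only lower the returned cost, which is why the rule bounds the optimal cost from above. The brute-force recursion is meant for small `k` and small `n`.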


7.  Analysis 

Looking at some representative scenarios, we next analyze
results from §6. Section 7.1 examines the three-cell case,
and in §7.2 we analyze the effect of the number of cells on the
expected cost.
Figure 5. Engage region for $q \in \{0.35, 0.55, 0.75, 0.95\}$ and combinations of $\alpha \in \{0.5, 1.5\}$ and $\rho \in \{0.1, 1\}$.
Panels: (a) $\alpha = 0.5$ and $\rho = 0.1$; (b) $\alpha = 0.5$ and $\rho = 1$; (c) $\alpha = 1.5$ and $\rho = 0.1$; (d) $\alpha = 1.5$ and $\rho = 1$.
Note. The engage region lies to the southeast of each curve.

7.1. Three-Cell Case

Figure 5 illustrates the three-cell engage region in the
$p_1 \times p_2$ plane for $p_1 \ge p_2 \ge p_3 = 1 - p_1 - p_2$. The thin
dashed-line triangle outlines the feasible $p_1, p_2$ values. Each
subfigure fixes values for $\alpha$ and $\rho$ and contains four curves
for $q \in \{0.35, 0.55, 0.75, 0.95\}$. The southeast area of the
cone corresponds to the engage region of the state space. As
discussed in the introduction of §6, a threshold policy may
not be optimal. However, in many cases such a policy may
perform well based on the vertical nature of the boundaries
when, for example, $\alpha$ is relatively small or $q$ is not too
small.

Similarly to the two-cell case, the engage region decreases
with the reliability of the informant because the benefit from
additional tips increases. Larger values of $\alpha$ or $\rho$ increase
the size of the engage region because the cost or likelihood
of an attack increases. For larger values of $\rho$ (Figures 5(b)
and 5(d)), the boundaries for the various reliability values
are closer together than for smaller $\rho$ (Figures 5(a) and 5(c)).
The informational value of tips for smaller $\rho$ is greater
than for larger $\rho$, and therefore the reliability has a greater
impact. The wait region in Figure 5(d) is empty because this
situation corresponds to a blind engagement scenario (see
condition (16)), which implies the searcher will engage for
any state for any informant reliability. We only consider $\rho \le 1$
scenarios; larger values of $\rho$ (imminent attack compared to
the flow of tips) correspond to blind engagement scenarios
for most values of $\alpha$.

In  §6  we  derive  sufficient  conditions  to  engage  or  wait  that 
the  searcher  can  use  as  heuristic  policies.  Figure  6,  which 
has  the  same  structure  as  Figure  5,  illustrates  the  engage 
regions  generated  by  these  heuristics.  The  smooth  solid  line 
represents  the  optimal  engage-wait  boundary.  The  other 
three  (marked)  solid  lines  correspond  to  heuristics  based 
on  the  sufficient  conditions  to  engage  described  in  §6.1,  as 
explained  in  the  following: 

• The sufficient condition to engage in (15), corresponding
to perfect detection after one tip, is denoted eng(1-tip) and
represented by the -o- curve.

• Condition (19), corresponding to perfect detection after
two tips, is denoted eng(2-tips) and represented by the -x-
curve. As discussed in §6.1, condition (19) is tighter than
(15) and thus lies closer to the optimal curve.

• Condition (20), which we derive by combining cells
2 and 3 into an uber-cell and using the two-cell threshold
policy, is denoted eng(2-cell policy) and corresponds to the
-V- curve.

Figure 6. Engage region for various heuristic policies for $q = 0.55$ and combinations of $\alpha \in \{0.5, 1.5\}$ and $\rho \in \{0.1, 1\}$.
Panels: (a) $\alpha = 0.5$ and $\rho = 0.1$; (b) $\alpha = 0.5$ and $\rho = 1$; (c) $\alpha = 1.5$ and $\rho = 0.1$; (d) $\alpha = 1.5$ and $\rho = 1$. Axis label: $p_1$.
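The first two heuristics in the list above can be compared numerically with a small sketch. It assumes the same tip model as the rest of the paper (a tip names the correct cell with probability $q$ and each wrong cell with probability $(1-q)/(n-1)$) and reads the right-hand sides of (15) and (19) as written in the code comments. The names `tip_update`, `one_tip_bound`, and `two_tip_bound` are hypothetical. The two-cell rule in (20) is omitted because it needs the threshold $\tau$ from Appendix D.

```python
def tip_update(p, i, q):
    """Posterior after a tip naming cell i, and the probability of that tip."""
    n = len(p)
    miss = (1.0 - q) / (n - 1)
    weights = [pj * (q if j == i else miss) for j, pj in enumerate(p)]
    r_i = sum(weights)
    return tuple(w / r_i for w in weights), r_i


def one_tip_bound(alpha, rho):
    """Right-hand side of (15): rho/(1+rho) * (1 - alpha) + 1/(1+rho)."""
    return rho / (1.0 + rho) * (1.0 - alpha) + 1.0 / (1.0 + rho)


def two_tip_bound(p, q, alpha, rho):
    """Right-hand side of (19), built from the one-tip posteriors."""
    x = one_tip_bound(alpha, rho)
    total = 0.0
    for i in range(len(p)):
        posterior, r_i = tip_update(p, i, q)
        total += r_i * max(max(posterior), x)
    return rho / (1.0 + rho) * (1.0 - alpha) + total / (1.0 + rho)


# eng(1-tip) engages when max(p) >= one_tip_bound(alpha, rho);
# eng(2-tips) engages when max(p) >= two_tip_bound(p, q, alpha, rho).
p = (0.74, 0.16, 0.10)
print(max(p) >= one_tip_bound(0.5, 1.0))
print(max(p) >= two_tip_bound(p, 0.55, 0.5, 1.0))
```

Because the inner `max` never exceeds 1, `two_tip_bound` is never larger than `one_tip_bound`, so the two-tip rule engages in every state where the one-tip rule does.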

Figure 6 also contains the myopic policy, which is associated
with the wait conditions from §6.2. The condition
appears in (21)-(22); we denote it on the figure as
wait(myopic), and it corresponds to the --o-- curve.

The eng(1-tip) heuristic (-o-) performs poorly. This is
not surprising considering it assumes zero cost after one
tip. The eng(2-cell policy) rule (-V-) performs reasonably
well overall. In situations with large $\alpha$ and $\rho$ (Figure 6(d)),
nearly all the heuristics produce optimal results.

The wait(myopic) heuristic performs very well except
for small values of $\alpha$ and $\rho$ (Figure 6(a)). In such “low-
cost-of-attack, low-risk-of-attack” scenarios, the searcher
gains significant benefits from waiting for several additional
tips, and wait(myopic) fails to account for this. “Murky”
states with limited situational awareness lie at the northwest
region of the state space, whereas “clear” states with a strong
best candidate cell lie in the southeast. If wait(myopic)
recommends to engage in a murky state, engaging usually
is the optimal policy. However, this policy may produce
the wrong decision in clear states for small values of $\rho$.
For example, consider the state $\mathbf{p} = (0.70, 0.20, 0.10)$ in
Figure 6(a). Intuitively, engaging seems like the right decision
for this state because cell 1 is a strong candidate for the
target’s location. Indeed, wait(myopic) recommends to
engage in this state. However, because $\rho$ is small, the
searcher can afford to collect several more tips to strengthen
situational awareness, and the optimal policy recognizes this:
the optimal engage region lies significantly to the southeast
of $\mathbf{p} = (0.70, 0.20, 0.10)$ in Figure 6(a).

We also examine how much the cost increases using a
heuristic instead of the optimal policy by generating 84,000
scenarios representative of the examples in Figures 5 and 6
for $0.35 \le q \le 0.95$, $0.5 \le \alpha \le 1.5$, $0.1 \le \rho \le 1$, over the
entire state space for $\mathbf{p}$. The myopic policy performs very
well; on average it is within 1% of optimal. Figure 6(a)
illustrates when the myopic policy can produce a cost significantly
greater than optimal: small $\rho$ and $\alpha$ and moderate $q$
and $p_1$. There is no benefit to one additional tip, but reasonable
cost reduction can occur through several additional
tips. The strong performance of the myopic policy also
occurs for $n > 3$ as long as $\rho$ is not too small (i.e., $\rho > 0.1$).
See Appendix J.1 for a more thorough analysis of several
heuristics for both $n = 3$ and $n > 3$ scenarios. These results
suggest  that  not  only  can  the  searcher  confidently  use  the 
myopic  policy  operationally  in  most  scenarios,  but  the  policy 
may  provide  a  rough  estimate  of  the  cost  to  wait,  which  is 
analytically  difficult  to  compute.  In  practice,  if  the  cost  to 
wait  is  only  slightly  smaller  than  the  cost  to  engage,  the 
searcher  may  still  choose  to  engage  because  of  uncertainties 
associated  with  the  model  parameters  or  other  frictions  we 
do  not  account  for  in  the  model.  In  Appendix  J.2  we  explore 
this  idea  further. 

7.2.  Impact  of  Number  of  Cells 

Following  the  discussion  in  §5,  we  observe  that  the  situation 
seems  to  improve  for  the  searcher  as  the  number  of  cells 
n  increases  because  it  becomes  less  likely  that  incorrect 
tips  will  cluster  on  one  particular  cell,  leading  the  searcher 
astray. Figure 7 displays the relationship between the optimal
cost $C(\mathbf{p})$ and $n$ for various values of $q$ and two scenarios
regarding an attack: (a) low-cost, low-risk (Figure 7(a))
and (b) high-cost, high-risk (Figure 7(b)). These figures
illustrate that increasing $n$ may generate only minor benefits,
and the cost may actually increase in certain situations.
The  slope  of  the  curve  depends  upon  one  of  three  possible 
policies  taken  by  the  searcher: 

1. Blind engagement scenario: the searcher engages a cell
uniformly at random, incurring a cost of $(n-1)/n$.

2. The searcher obtains one tip and engages the corresponding
cell, which incurs cost $(\rho/(1+\rho))\alpha + (1/(1+\rho))(1-q)$.

3. The searcher obtains at least two tips.

For option 1 the searcher prefers a small $n$, the option 2
cost is independent of $n$, and intuitively the cost should
decrease with $n$ for option 3. In the high-cost, high-risk
scenario in Figure 7(b), the searcher chooses either option 1
(when the curves increase) or option 2 (when the curves
flatten out). For small $\alpha$ and $\rho$ (Figure 7(a)), the searcher
chooses either option 2 or 3. Even though the cost is
nonincreasing with $n$ in Figure 7(a), the cost significantly
decreases for only moderate values of $q$ and the curves
flatten out quickly.
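The switch between options 1 and 2 as $n$ grows can be sketched with the two closed-form costs. This is a rough illustration only: it ignores option 3, whose cost has no closed form here, and the function names are hypothetical.

```python
def blind_engagement_cost(n):
    """Option 1: engage a uniformly random cell, wrong with chance (n - 1) / n."""
    return (n - 1) / n


def one_tip_cost(q, alpha, rho):
    """Option 2: wait for one tip, then engage the cell it names."""
    return rho / (1.0 + rho) * alpha + (1.0 - q) / (1.0 + rho)


def cheaper_simple_option(n, q, alpha, rho):
    """Compare options 1 and 2 only and return (label, cost)."""
    blind = blind_engagement_cost(n)
    one_tip = one_tip_cost(q, alpha, rho)
    if blind <= one_tip:
        return "blind engagement", blind
    return "one tip", one_tip


# High-cost, high-risk setting (alpha = 1.5, rho = 1) with q = 0.95.
for n in (2, 3, 4, 5, 6):
    print(n, cheaper_simple_option(n, 0.95, 1.5, 1.0))
```

In this setting the blind cost rises with $n$ until it passes the one-tip cost, which does not depend on $n$. That matches the rising and then flat shape described for the high-cost, high-risk curves.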

8.  Extensions 

In  our  model  we  make  several  assumptions  that  may  not 
apply  in  reality.  Our  objective  is  to  gain  insight  through 
analysis  of  a  relatively  simple  setting.  Several  extensions 
are  possible,  and  the  key  to  handling  them  is  to  properly 
modify  the  cost  function  (11)  such  that  most  of  the  results 
from  §§3-6  generalize  in  a  natural  way.  Because  of  space 
considerations,  we  only  present  one  extension  in  this  section. 
Appendix  L  considers  several  others.  The  main  extension 
we  analyze  here  focuses  on  the  situation  where  the  search 
continues if the searcher chooses the wrong cell. In this case,
the target does not rush his attack if the searcher chooses the
wrong cell and only executes a mature attack. In Appendix L
we consider the situation where one source generates a
stream of correlated tips. In that case future tips become
less valuable. We also examine the situation where there is
no target and the searcher has the option to end the search
before an engagement. Other extensions allow for multiple
classes of informants and nonexponential distributions for
the time until the target executes the attack.

Figure 7. Optimal cost in the uniform state as a function of $n$ for $q \in \{0.21, 0.35, 0.55, 0.75, 0.95\}$ for two combinations
of $\alpha$ and $\rho$. Panels: (a) $\alpha = 0.5$ and $\rho = 0.1$; (b) $\alpha = 1.5$ and $\rho = 1$.


8.1.  Search  Continues  Following  an  Incorrect 
Engagement 

In  some  situations,  when  the  target  is  either  oblivious  to  the 
searcher’s  failed  attempt  or  determined  to  wait  until  the  plot 
matures,  the  search  may  continue  following  the  engagement 
of an empty cell. Because the target is static and detection
is perfect, the searcher can discard evidently empty cells
from future consideration. Specifically, $p_j = 0$ following an
engagement of an empty cell $j$. The cost of engaging cell $j$
incorrectly is $c_j$. Because we allow the false positive cost
to vary by cell, the searcher may opt to engage cells with
a small location probability if $c_j$ is also small, in order to
eliminate the cell from further consideration. Rather than
use the cost ratio $\alpha$, in this subsection we include separate
parameters for the false positive cost ($c_j$) and the damage
from a mature attack ($d$).

The system now has two types of state transitions. The
first, as before, occurs when a tip points at cell $i$, in which
case state $\mathbf{p}$ transitions to state $\mathbf{p}^{(+i)}$. The second (new)
type occurs when the searcher incorrectly engages cell $j$,
and the state $\mathbf{p}$ transitions to state $\mathbf{p}^{(-j)}$ in which $p_j = 0$.
The set $A(\mathbf{p}) = \{i : p_i > 0\}$ represents the “active” cells
(i.e., cells that have not been incorrectly searched yet). The
informant is aware of the searcher’s failed engagements and
therefore refrains from pointing at these cells in future tips.
The probability mass associated with an evidently empty
cell is proportionally redistributed among the active cells.
That is,

$$
P[\text{informant says } i \mid \mathbf{p}, \text{target in } k] =
\begin{cases}
\dfrac{q}{q + (|A(\mathbf{p})| - 1)(1-q)/(n-1)} & \text{if } i = k, \\[2ex]
\dfrac{(1-q)/(n-1)}{q + (|A(\mathbf{p})| - 1)(1-q)/(n-1)} & \text{if } i \ne k.
\end{cases}
$$


Under this reasonable assumption the ratio $\gamma$ between the
probabilities of correct and incorrect tips remains unchanged,
and therefore $\mathbf{p}^{(+i)}$ is computed as in Equation (5). If cell $i$
is searched and found empty, then

$$
p_j^{(-i)} =
\begin{cases}
0 & \text{if } j = i, \\[1ex]
\dfrac{p_j}{\sum_{k \ne i} p_k} & \text{if } j \ne i.
\end{cases}
$$


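Next, we slightly modify the definition of $r_i(\mathbf{p})$ from (7) to
ensure that $\sum_{i=1}^{n} r_i(\mathbf{p}) = 1$. Specifically,

$$
r_i(\mathbf{p}) =
\begin{cases}
\dfrac{q}{q + (|A(\mathbf{p})| - 1)(1-q)/(n-1)}\, p_i
+ \dfrac{(1-q)/(n-1)}{q + (|A(\mathbf{p})| - 1)(1-q)/(n-1)}\, (1 - p_i) & \text{if } i \in A(\mathbf{p}), \\[2ex]
0 & \text{if } i \notin A(\mathbf{p}).
\end{cases}
$$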
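The two bookkeeping steps in this extension, removing a cell after a failed engagement and recomputing the tip probabilities over the active cells, can be sketched as follows. The sketch assumes the renormalized tip model described above; `remove_cell` and `tip_probabilities` are hypothetical names.

```python
def remove_cell(p, i):
    """State after cell i is engaged and found empty.

    The probability of cell i drops to zero and the remaining mass is
    rescaled proportionally over the other cells.
    """
    rest = sum(pj for j, pj in enumerate(p) if j != i)
    return tuple(0.0 if j == i else pj / rest for j, pj in enumerate(p))


def tip_probabilities(p, q):
    """Probability that the next tip names each cell, over active cells only.

    n is the original number of cells; cells with zero probability are
    treated as already searched and are never named by the informant.
    """
    n = len(p)
    active = sum(1 for pj in p if pj > 0)
    miss = (1.0 - q) / (n - 1)
    denom = q + (active - 1) * miss  # renormalizes over the active cells
    return tuple(
        (q * pj + miss * (1.0 - pj)) / denom if pj > 0 else 0.0
        for pj in p
    )


state = remove_cell((0.5, 0.3, 0.2), 0)
print(state)
print(tip_probabilities(state, 0.55), sum(tip_probabilities(state, 0.55)))
```

The division by `denom` is what keeps the tip probabilities summing to one after cells have been discarded.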


Although the expected cost to wait remains essentially the
same as in the original model, the expected cost to engage
becomes:

$$
E[\text{Cost of engaging cell } j \mid \mathbf{p}] = (1 - p_j)\left(c_j + C(\mathbf{p}^{(-j)})\right).
$$

The updated cost function is:

$$
C(\mathbf{p}) = \min\left( \min_{j \in A(\mathbf{p})} (1 - p_j)\left(c_j + C(\mathbf{p}^{(-j)})\right),\ \frac{\rho}{1+\rho}\,d + \frac{1}{1+\rho} \sum_{i \in A(\mathbf{p})} r_i(\mathbf{p})\, C(\mathbf{p}^{(+i)}) \right). \tag{23}
$$

Obviously, if only one active cell remains ($|A(\mathbf{p})| = 1$),
$C(\mathbf{p}) = 0$ because the searcher knows the only remaining
cell contains the target.

The analysis of the cost function and engage decision is
similar to the analysis in §§3-7. First consider the case of
imminent threat where the searcher does not wait for tips
but continuously engages cells until he finds the target. This
is the classical whereabouts search problem (Kadane 1971,
Stone 1975) for which the optimal policy is to search the
cells in ascending order of the ratios $c_j/p_j$, $j = 1, \ldots, n$. Let
$g(i)$ denote the index of the $i$th smallest value of $c_j/p_j$ in
$A(\mathbf{p})$. Thus, $g(1)$ and $g(|A(\mathbf{p})|)$ are the indices of the cells
with the smallest and largest ratios $c_j/p_j$, respectively. Let
$K(\mathbf{p})$ denote the cost of this policy. In Appendix K we
show that

$$
K(\mathbf{p}) = \sum_{j=2}^{|A(\mathbf{p})|} p_{g(j)} \sum_{i=1}^{j-1} c_{g(i)}. \tag{24}
$$

The searcher should engage a cell if $K(\mathbf{p}) \le (\rho/(1+\rho))d$.
If that engaged cell is empty, this condition may not hold in
the next state. It is most reasonable (albeit not proved) that
the searcher should engage cell $g(1)$.
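Equation (24) and the engage test built on it can be sketched in Python. The code assumes the ascending $c_j/p_j$ ordering described above and reads $K(\mathbf{p})$ as the expected total cost of the failed engagements made before the target is found. The function names are hypothetical.

```python
def sequential_search_cost(p, c):
    """Return (K, order) for engaging active cells in ascending c_j / p_j.

    p : location probabilities (zero marks an already-searched cell)
    c : false positive cost of each cell
    """
    active = [j for j in range(len(p)) if p[j] > 0]
    order = sorted(active, key=lambda j: c[j] / p[j])
    total = 0.0
    spent = 0.0  # cost of failed engagements made so far in the sequence
    for j in order:
        total += p[j] * spent  # target in cell j: all earlier engagements failed
        spent += c[j]
    return total, order


def should_engage_now(p, c, d, rho):
    """Engage test K(p) <= rho / (1 + rho) * d, with d the mature-attack damage."""
    k_value, _ = sequential_search_cost(p, c)
    return k_value <= rho / (1.0 + rho) * d


print(sequential_search_cost((0.5, 0.3, 0.2), (1.0, 1.0, 1.0)))
print(sequential_search_cost((0.5, 0.3, 0.2), (4.0, 1.0, 1.0)))
```

In the second call the most likely cell is also the most expensive to engage, so it moves to the end of the sequence. This is the situation in which a low-probability cell with a small cost is worth engaging early.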

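$K(\mathbf{p})$ also plays a crucial role in the sufficient condition
to wait

$$
\text{wait if } \min_j c_j (1 - p_j) > \frac{\rho}{1+\rho}\,d + \frac{1}{1+\rho} \sum_{i \in A(\mathbf{p})} r_i(\mathbf{p})\, K(\mathbf{p}^{(+i)}).
$$

Note that computing $K(\mathbf{p}^{(+i)})$ requires ranking according to
$c_j/p_j^{(+i)}$, which depends on $i$.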
9. Summary and Conclusions

In this paper we study a time-critical variant of the whereabouts
problem in search theory. This variant applies to many
criminal, military, and homeland security situations where
an investigation team must decide when to act on uncertain
intelligence. Examples include counterterror and counterinsurgency
operations, which rely on human intelligence and
intercepted communications. Unlike the original whereabouts
model that produces a sequencing rule, we consider here
a stopping rule; rather than advising the searcher how to
optimally sequence the search among the various cells, our
model identifies the time when the information is sufficiently
definitive to act upon. Either action (engage or wait for additional
information) incurs costs. We analytically solve the
two  extremes:  the  two-cell  case  uses  a  threshold  policy  and 
the  searcher  chooses  among  three  options  in  the  infinite-cell 
case.  We  also  illustrate  how  the  engage  region  of  the  state 
space  varies  with  the  model  parameters  for  the  three-cell 
case.  For  larger  problems,  we  use  a  k-stage  look-ahead 
approach  to  obtain  sufficient  conditions  to  engage  or  wait. 
We  show  that  these  conditions  converge  to  a  necessary  and 
sufficient  condition  to  engage  as  k  increases.  In  particular 
for  k  =  1,  the  myopic  policy  provides  nearly  optimal  results 
over  a  broad  range  of  parameter  values.  The  model  clearly 
captures  the  trade-offs  among  the  various  components  of  the 
threat:  the  mean  time  until  the  plot  matures,  the  flow  rate  of 
tips,  and  the  damages  associated  with  failed  searches  and 
successful  attacks.  We  present  several  variants  of  the  model 
in  §8  and  Appendix  L  to  capture  alternative  scenarios.  These 
include  the  search  continuing  after  an  incorrect  engagement, 
multiple  types  of  informants,  and  nonexponential  attack  time 
distributions.  Most  of  the  analysis  and  methods  discussed 
apply  to  these  extensions. 

Some  of  our  main  results  are  intuitive:  the  searcher  is 
more  likely  to  wait  with  a  more  reliable  informant  and  is 
more  likely  to  engage  as  the  cost  or  likelihood  of  a  mature 
attack  increases.  Less  intuitive  insights  that  emerge  from 
our  analysis  include  the  following:  (1)  the  optimal  number 
of  tips  received  by  the  searcher  may  not  be  monotone  as 
a  function  of  the  informant  reliability  (see  §4)  and  (2)  in 
many  cases  there  is  little  to  no  reduction  in  the  optimal  cost 
as  we  increase  the  number  of  cells  (see  §7.2). 

Future  work  could  model  the  reliability  parameter  q  as 
a  random  variable  (e.g.,  beta  distributed),  which  updates 
as  the  searcher  receives  more  information.  This  would  be 
particularly  appropriate  in  the  situation  where  the  target 
only  executes  his  attack  when  it  fully  matures  (see  §8.1). 
In  this  case  the  searcher  could  search  multiple  cells  and 
thus  verify  the  reliability  of  the  informant.  Another  variant 
would  capture  strategic  behavior  of  the  target  who  trades  off 
a  more  effective  attack  that  needs  longer  planning  time  with 
the  increased  risk  of  detection  by  the  searcher.  Finally,  one 
could  examine  another  time-critical  situation  where  the  target 
may  leave  instead  of  executing  an  attack  (e.g.,  a  criminal 
or  terrorist  leader  who  moves  around  to  avoid  detection). 
In  this  case  the  searcher  has  three  options:  receive  another 


tip,  engage  a  cell,  or  call  off  the  search  because  the  target 
has  likely  left  the  system.  The  modeling  of  this  situation 
may  include  changepoint  analysis  (Carlstein  et  al.  1994)  to 
handle  the  change  in  tip  dynamics  after  the  target  departs. 

Supplemental  Material 

Supplemental material to this paper is available at
http://dx.doi.org/10.1287/opre.2016.1488.